A problem of interest in genetics is that of testing whether a mixture of two binomial
distributions Bi(k, p) and Bi(k, 1/2) is simply the pure distribution Bi(k, 1/2). This
problem arises in determining whether we have a genetic marker for a gene responsible
for a heterogeneous trait, that is, a trait which is caused by any one of several genes. In
that event we would have a nontrivial mixture involving 0 < p < 0.5, where p is a
recombination probability.

Standard asymptotic theory breaks down for such problems, which belong to a class
of problems where a natural parameterization represents a single distribution, under the
hypothesis to be tested, by infinitely many possible parameter points. That difficulty
may be eliminated by a transformation of parameters, but then a second problem
appears: the regularity conditions required for the Fisher Information to apply
fail when k > 2. We present an approach that uses the Kullback-Leibler
Information, of which the Fisher Information is a limiting case.

Several versions of the binomial mixture problem are studied. The asymptotic analysis
is supplemented by the results of simulations. It is shown that as $n \to \infty$, the asymptotic
distribution of twice the logarithm of the likelihood ratio corresponds to the square of the
supremum of a Gaussian stochastic process with mean 0, variance 1 and a well behaved
covariance function. As $k \to \infty$ this limiting distribution grows stochastically as the
square root of log k.

1. Introduction.

A problem of interest in genetics is that of testing the hypothesis that a mixture of
two binomial distributions Bi(k, p) and Bi(k, 1/2) is the degenerate case of the single
binomial Bi(k, 1/2). This problem arises in determining whether there is a marker for a
genetically  heterogeneous  trait,  i.e.,  a  trait  that  can  be  caused  by  a  mutation  at  any  one 
of  several  different  loci.  For  parametric  hypothesis  testing  problems  it  is  customary  to  use 
the  generalized  likelihood  ratio  as  a  test  statistic.  Under  standard  regularity  conditions,  a 
classical  result  of  Wilks  (1938)  states  that  if  the  hypothesis  is  true,  twice  the  logarithm  of 
the  likelihood  ratio  has,  asymptotically,  a  chi-square  distribution. 

The  regularity  conditions  are  not  satisfied  for  our  mixture  problem.  Moreover,  under 
the  parametrization  ordinarily  used  for  this  type  of  problem,  the  hypothesis,  which  is 
simple  and  uniquely  determines  the  distribution  of  the  data,  corresponds  to  an  infinite 
set  of  parameter  points  designating  the  mixture  fraction  and  the  probability  p.  This 
complication  may  be  eliminated  by  introducing  an  alternative  parametrization. 

For the case $k = 2$ with this reparametrization, most of the regularity conditions
are  satisfied,  and  a  generalization  of  the  Wilks  result  (Chernoff,  1954)  establishes  that  the 
asymptotic  distribution  of  twice  the  logarithm  of  the  likelihood  ratio  is  a  mixture  of  three 
distributions,  two  of  which  are  those  of  chi-square  with  one  and  two  degrees  of  freedom. 
However,  for  k  >  2,  the  regularity  conditions  no  longer  apply. 

For this special problem, the distribution of the likelihood ratio can be determined by
simulation. However, asymptotic theory is useful in understanding generalizations of our
problem. One generalization is for the model where independent observations correspond
to different values of k. Another is when we wish to test that a mixture of Bi(k, $p_1$) and
Bi(k, $p_2$) with unknown $p_1$ and $p_2$ is really a single binomial. We shall analyze the
first of these generalizations. These problems belong to a large class of problems where
regularity conditions fail, and include mixture problems discussed by Ghosh and Sen (1985)
and change point and segmented regression problems treated by Feder (1975).

The main idea behind our approach is that the Fisher Information, which characterizes
the behavior of the maximum likelihood estimate (MLE), degenerates in problems which
lack regularity. However, the Fisher Information is a limiting case of the Kullback-Leibler
(KL) Information, which is the expectation of the logarithm of the likelihood ratio and is
the natural measure of the ability to use data to discriminate between alternative hypotheses.
The study of the KL Information for nearby alternatives clarifies what constitutes appropriate
parametrizations and the asymptotic behavior of the likelihood ratio as well as the MLE.

In  our  problems  we  shall  express  the  asymptotic  distribution  of  the  logarithm  of  the 
likelihood  ratio  in  terms  of  the  maximum  of  the  square  of  a  relatively  simple  Gaussian 
stochastic  process. 

In Section 2 we present formal statements of several problems and the appropriate
parametrization. In Section 3, we discuss the asymptotic distribution under the null
hypothesis for the case of $k = 2$. In Section 4, the asymptotic distribution for the case of
arbitrary k is derived. In Section 5, extensions of these results are presented to include
several values of k, a restricted version of the problem where p is bounded away from 1/2,
noncentral results and large deviation results. Derivations appear in an appendix.
Section 6 presents results of simulations comparing asymptotic and finite sample distributions.

The main asymptotic result is that under the null hypothesis, twice the logarithm of
the likelihood ratio behaves like the square of the maximum of a stochastic process in a
variable $\phi$, and for each value of $\phi$, the process has mean 0 and variance 1. The problem
discussed here is a special case of a more general theory which applies to mixture problems
and change point problems. The most general discussion to date, one which applies to
mixture problems and would include our problem as a special case, is due to Ghosh and
Sen (1985).




2.  Problem  Statements  and  Reparametrization. 

We present a formal statement of the problem and several variations. In what follows
$\mathcal{L}(X)$ stands for the distribution (law) of $X$ and $\mathcal{L}(X \mid Y)$ is the conditional distribution
of $X$ given $Y$. The distribution and expectation for a given value of the parameter $\theta$
are represented by $\mathcal{L}_\theta$ and $E_\theta$. The binomial distribution corresponding to k trials with
probability p is designated by Bi(k, p) and $N(\mu, \Sigma)$ represents the normal distribution
with mean $\mu$ and covariance matrix $\Sigma$. The chi-square distribution with m degrees of
freedom is written $\mathcal{L}(\chi^2_m)$.

PROBLEM 1. Let $X_1, X_2, \ldots, X_n$ be i.i.d. observations on a random variable $X$ for
which the distribution is

$$\mathcal{L}(X) = \alpha\,\mathrm{Bi}(k, p) + (1 - \alpha)\,\mathrm{Bi}(k, 1/2) \tag{2.1}$$

where $\alpha$ and $p$ are unknown and $0 \le \alpha \le 1$ and $0 \le p \le 1$. The hypothesis,
$H_0: p = 1/2$ or $\alpha = 0$, is tested using the likelihood ratio test. Assuming $H_0$ is true,
what are the distributions of the likelihood ratio and the maximum likelihood estimates
of $\alpha$ and $p$?
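As a rough illustration of Problem 1, the finite sample null distribution of 2L can be approximated by simulation. The sketch below is not the authors' procedure: it assumes NumPy is available, approximates the supremum over $(\alpha, p)$ by a plain grid search, and the names `two_log_lr` and `simulate_null` are hypothetical.

```python
import numpy as np

def two_log_lr(counts, k, n_alpha=51, n_p=101):
    """Approximate 2L for Problem 1 by a grid search over (alpha, p).

    counts[j] is the number of observations equal to j, j = 0..k.
    The grid is an assumption of this sketch, not part of the paper.
    """
    counts = np.asarray(counts)
    j = np.arange(k + 1)
    alphas = np.linspace(0.0, 1.0, n_alpha)   # includes alpha = 0 (null)
    ps = np.linspace(0.0, 1.0, n_p)           # includes p = 1/2 (null)
    # Ratio Bi(k,p)/Bi(k,1/2) at j: (2p)^j (2(1-p))^(k-j); shape (n_p, k+1)
    ratio = (2 * ps[:, None]) ** j * (2 * (1 - ps[:, None])) ** (k - j)
    # f(j,theta)/f(j,0) = 1 + alpha*(ratio - 1); shape (n_alpha, n_p, k+1)
    lr = 1.0 + alphas[:, None, None] * (ratio[None, :, :] - 1.0)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Cells with zero count contribute nothing, even where lr == 0.
        terms = np.where(counts > 0, counts * np.log(lr), 0.0)
    loglik = terms.sum(axis=2)
    return 2.0 * loglik.max()

def simulate_null(n, k, reps, seed=0):
    """Draw reps samples of size n from Bi(k, 1/2) and return the 2L values."""
    rng = np.random.default_rng(seed)
    x = rng.binomial(k, 0.5, size=(reps, n))
    out = np.empty(reps)
    for r in range(reps):
        out[r] = two_log_lr(np.bincount(x[r], minlength=k + 1), k)
    return out

if __name__ == "__main__":
    vals = simulate_null(n=40, k=4, reps=200)
    print(np.quantile(vals, [0.5, 0.9, 0.95]))
```

Because the grid contains the null points, the returned value is never negative; a finer grid or a numerical optimizer would tighten the approximation to the true supremum.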

PROBLEM 2. Let $X_1, X_2, \ldots, X_n$ be independent observations where

$$\mathcal{L}(X_i) = \alpha\,\mathrm{Bi}(k_i, p) + (1 - \alpha)\,\mathrm{Bi}(k_i, 1/2),$$

the $k_i$ are known, $n_k = n\lambda_k$ is the number of times $k_i = k$, $\sum_k \lambda_k = 1$, $0 \le p \le 1$ and
$0 \le \alpha \le 1$. The hypothesis, $H_0: p = 1/2$ or $\alpha = 0$, is tested as in Problem 1.

PROBLEM 3. Consider the variations of Problems 1 and 2 where p is restricted by
$0 \le p \le 0.5$. These variations are those most relevant for the application to genetics.
There, p represents the recombination fraction of a proposed marker to one of the loci
for the trait and achieves a maximum value of 1/2 when there is no linkage. Furthermore,
k represents the size of the family studied and may vary from one family to another. We
will refer to these variations as the one sided cases.

PROBLEM 4. Consider the variation of Problem 1 where

$$\mathcal{L}(X) = \alpha\,\mathrm{Bi}(k, p_1) + (1 - \alpha)\,\mathrm{Bi}(k, p_2)$$

with $p_1$, $p_2$ and $\alpha$ unknown, $0 \le p_1 \le 1$, $0 \le p_2 \le 1$, $0 \le \alpha \le 1$. It is desired to test
the hypothesis $H_0: p_1 = p_2$ or $\alpha = 0$ or 1.

Under $H_0$ in Problem 1, $\mathcal{L}(X) = \mathrm{Bi}(k, 1/2)$. The same distribution applies when
$p = 1/2$, no matter what the value of $\alpha$ is, or when $\alpha = 0$, no matter what the
value of $p$. In effect the hypothesis $H_0$, which corresponds to
$\{(\alpha, p): \alpha = 0,\ 0 \le p \le 1\} \cup \{(\alpha, p): p = 1/2,\ 0 \le \alpha \le 1\}$, really corresponds
to only one point in the space of distributions. Thus a more "natural" parametrization should
have only one point correspond to the above set. We offer, as an alternative, the following
parametrization which will prove convenient. Let $\theta = (\theta_1, \theta_2)^T$ (the exponent T is used
for transpose) where

$$\theta_1 = \alpha(p - 1/2)/2, \qquad \theta_2 = \alpha(p - 1/2)^2 \tag{2.2}$$

then

$$\alpha = 4\theta_1^2/\theta_2 \tag{2.3}$$

and

$$p = \tfrac{1}{2}\left(1 + \frac{\theta_2}{\theta_1}\right) = (1 + \phi)/2 \tag{2.4}$$

where

$$\phi = \theta_2/\theta_1 = 2p - 1. \tag{2.5}$$

In the new parametrization, $H_0$ corresponds to $\theta = 0$ or, equivalently, $\theta_1 = \theta_2 = 0$.
The range of $(\theta_1, \theta_2)$ lies between the line $\theta_2 = \theta_1$ and the parabola $\theta_2 = 4\theta_1^2$ for
$0 \le \theta_1 \le 1/4$ and between $\theta_2 = -\theta_1$ and $\theta_2 = 4\theta_1^2$ for $-1/4 \le \theta_1 \le 0$. Note that
$\theta_2 \ge 0$ and $\phi$ ranges from -1 to 1, and has the same sign as $\theta_1$.

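In terms of the new parametrization, the probability density of X is

$$f(x, \theta) = 2^{-k}\binom{k}{x}\left\{1 + \frac{4\theta_1^2}{\theta_2}\left[\left(1 + \frac{\theta_2}{\theta_1}\right)^{x}\left(1 - \frac{\theta_2}{\theta_1}\right)^{k-x} - 1\right]\right\}, \qquad x = 0, 1, \ldots, k,$$

and, under the hypothesis,

$$f(x, 0) = 2^{-k}\binom{k}{x}, \qquad x = 0, 1, \ldots, k.$$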
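The reparametrization can be checked numerically. The sketch below takes $\theta_1 = \alpha(p - 1/2)/2$ and $\theta_2 = \alpha(p - 1/2)^2$, which is the reading consistent with (2.3) to (2.5), and confirms that the mixture density written in $(\alpha, p)$ agrees with the density written in $(\theta_1, \theta_2)$. The function names are hypothetical and only the Python standard library is used.

```python
from math import comb

def to_theta(alpha, p):
    """Map (alpha, p) to (theta1, theta2); assumes the reading of (2.2) above."""
    d = p - 0.5
    return alpha * d / 2.0, alpha * d * d

def from_theta(theta1, theta2):
    """Invert the map via (2.3)-(2.5); requires theta1 != 0."""
    phi = theta2 / theta1
    return 4.0 * theta1 ** 2 / theta2, (1.0 + phi) / 2.0

def pmf_mixture(x, k, alpha, p):
    """alpha*Bi(k,p) + (1-alpha)*Bi(k,1/2) evaluated at x."""
    return comb(k, x) * (alpha * p ** x * (1 - p) ** (k - x)
                         + (1 - alpha) * 0.5 ** k)

def pmf_theta(x, k, theta1, theta2):
    """The same density in the (theta1, theta2) parametrization."""
    phi = theta2 / theta1
    bracket = (1 + phi) ** x * (1 - phi) ** (k - x) - 1.0
    return 0.5 ** k * comb(k, x) * (1.0 + 4.0 * theta1 ** 2 / theta2 * bracket)

if __name__ == "__main__":
    t1, t2 = to_theta(0.3, 0.2)
    print(from_theta(t1, t2))
    print([round(pmf_mixture(x, 5, 0.3, 0.2) - pmf_theta(x, 5, t1, t2), 15)
           for x in range(6)])
```

The inverse map is undefined at $\theta_1 = 0$, which reflects the fact that $\alpha$ and $p$ are not separately identified under the hypothesis.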


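The likelihood ratio is

$$\frac{f(X, \theta)}{f(X, 0)} = 1 + u(X, \theta) = 1 + \frac{4\theta_1^2}{\theta_2}\left\{\left(1 + \frac{\theta_2}{\theta_1}\right)^{X}\left(1 - \frac{\theta_2}{\theta_1}\right)^{k-X} - 1\right\}. \tag{2.6}$$

We may write

$$u(X, \theta) = 4\theta_1 P_{kX}(\phi)$$

where

$$P_{kx}(\phi) = \frac{(1 + \phi)^{x}(1 - \phi)^{k-x} - 1}{\phi}$$

is a polynomial of degree $k - 1$ in $\phi$.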
The logarithm of the likelihood ratio is

$$\ln[f(X, \theta)/f(X, 0)] = \ln[1 + u(X, \theta)].$$

The Fisher Information is defined by

$$J(\theta) = E_\theta\left\{\left[\frac{\partial \ln f(X, \theta)}{\partial\theta}\right]\left[\frac{\partial \ln f(X, \theta)}{\partial\theta}\right]^T\right\}$$

where $\partial/\partial\theta$ represents the column vector whose components are the partial derivatives
with respect to the components of $\theta$. The Kullback-Leibler Information is

$$K(\theta, \theta') = E_\theta\{\ln[f(X, \theta)/f(X, \theta')]\}. \tag{2.10}$$

3. Asymptotic distribution of likelihood ratio when $k = 2$.

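For Problem 1, the case $k = 2$ becomes relatively simple. There, we have

$$\begin{aligned}
f(0, \theta) &= 0.25 - 2\theta_1 + \theta_2 \\
f(1, \theta) &= 0.5 - 2\theta_2 \\
f(2, \theta) &= 0.25 + 2\theta_1 + \theta_2
\end{aligned} \tag{3.1}$$

We observe data on a multinomial distribution with three probabilities depending linearly
on two parameters. The Fisher Information is easily calculated and

$$J(0) = \begin{pmatrix} 32 & 0 \\ 0 & 16 \end{pmatrix}. \tag{3.2}$$

Let $\hat\theta_u$ be the MLE unrestricted by the restrictions on the range of $\theta$. Standard theory
tells us that when $\theta = 0$, $\mathcal{L}_0(\sqrt{n}\,\hat\theta_u) \to N(0, J^{-1}(0))$. Applying Chernoff (1954) with the
restriction on $\theta$, it follows that $L = \ln(\text{likelihood ratio})$ satisfies

$$\mathcal{L}_0(2L) \to \tfrac{1}{2}\mathcal{L}(\chi^2_1) + 2\lambda\,\mathcal{L}(\chi^2_2) + (\tfrac{1}{2} - 2\lambda)\,\mathcal{L}(Y_1^2 \mid 0 < Y_2 < \sqrt{2}\,Y_1) \quad \text{as } n \to \infty, \tag{3.3}$$

where $\mathcal{L}(\chi^2_m)$ is the chi-square distribution with m degrees of freedom, $Y_1$ and $Y_2$
are independent $N(0, 1)$ random variables, and

$$\lambda = \frac{1}{2\pi}\arctan(1/\sqrt{2}) = 0.098. \tag{3.4}$$

Alternatively, we may write

$$P_0\{2L \le x\} \to \left[\Phi(\sqrt{x}) - \tfrac{1}{2}\right] + 2\lambda\left[1 - e^{-x/2}\right] + 2\int_0^{\sqrt{x}}\left[\Phi(\sqrt{2}\,y) - \tfrac{1}{2}\right]\varphi(y)\,dy \tag{3.5}$$

where $\varphi$ and $\Phi$ are the density and cdf for the $N(0, 1)$ distribution. Some detail is
presented in Appendix 1.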
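The limiting distribution for $k = 2$ is simple enough to evaluate directly. The following sketch computes $\lambda$ and the limiting cdf, writing the last term of the mixture as $2\int_0^{\sqrt{x}}[\Phi(\sqrt{2}y) - 1/2]\varphi(y)\,dy$, which is what the conditional law of $Y_1^2$ given $0 < Y_2 < \sqrt{2}Y_1$ with weight $1/2 - 2\lambda$ works out to. The midpoint rule and the function names are choices made for this illustration only.

```python
import math

# Weight lambda: the angle arctan(1/sqrt(2)) as a fraction of a full turn.
LAMBDA = math.atan(1.0 / math.sqrt(2.0)) / (2.0 * math.pi)

def norm_pdf(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def norm_cdf(y):
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def limit_cdf_k2(x, steps=2000):
    """Limiting P{2L <= x} for Problem 1 with k = 2 (two sided case)."""
    if x <= 0:
        return 0.0
    r = math.sqrt(x)
    h = r / steps
    # Midpoint rule for the integral of [Phi(sqrt(2) y) - 1/2] * pdf(y) on [0, r].
    integral = h * sum(
        (norm_cdf(math.sqrt(2.0) * (i + 0.5) * h) - 0.5) * norm_pdf((i + 0.5) * h)
        for i in range(steps)
    )
    chi1_part = norm_cdf(r) - 0.5                      # (1/2) P{chi2_1 <= x}
    chi2_part = 2.0 * LAMBDA * (1.0 - math.exp(-x / 2.0))
    return chi1_part + chi2_part + 2.0 * integral

if __name__ == "__main__":
    print(round(LAMBDA, 3))
    for x in (1.0, 2.71, 3.84, 6.63):
        print(x, round(limit_cdf_k2(x), 4))
```

A useful sanity check is that the three weights $1/2$, $2\lambda$ and $1/2 - 2\lambda$ sum to one, so the function should approach 1 for large x.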

In Problem 3, a similar analysis involves a further restriction on $\theta$. There, $\theta$ is
restricted to the half of the range of $\theta$ for Problem 1 on which $\theta_1 \le 0$. Here

$$\mathcal{L}_0(2L) \to \tfrac{1}{2}\mathcal{L}(\chi^2_1) + \lambda\,\mathcal{L}(\chi^2_2) + (\tfrac{1}{2} - \lambda)\,\mathcal{L}(0) \tag{3.6}$$

where $\mathcal{L}(0)$ is the distribution which attaches probability one to the value 0.


4.  Asymptotic  distribution  of  likelihood  ratio  for  arbitrary  k. 

By relating the logarithm of the likelihood ratio to its expectation, the Kullback-Leibler
Information, we shall show that for specified $\phi = \theta_2/\theta_1$, twice the logarithm of the
likelihood ratio is simply expressed in terms of $S(Z, \phi)$ where $S(Z, \phi)$ is a polynomial in $\phi$,
linear in $Z$, and $Z$ is an asymptotically normal vector random variable. The resulting
characterization of the distribution of the likelihood ratio involves maximizing with respect to $\phi$.

4.1  The  Kullback  Leibler  Information. 

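First let us evaluate the KL Information based on a single observation for Problem 1.
Let

$$U = u(X, \theta) = \frac{4\theta_1^2}{\theta_2}\left\{(1 + \phi)^{X}(1 - \phi)^{k-X} - 1\right\} = 4\theta_1 P_{kX}(\phi). \tag{4.1}$$

Then, for $\theta = o(1)$,

$$K(0, \theta) = E_0\{-\ln(1 + U)\} = E_0\{-U + U^2/2\} + o(\theta_1^2).$$

But we can see, without calculating, that

$$E_0(U) = E_0\left\{\frac{f(X, \theta)}{f(X, 0)} - 1\right\} = 0,$$

and hence

$$E_0\left\{(1 + \phi)^{X}(1 - \phi)^{k-X}\right\} = 1.$$

However, to evaluate $E_0(U^2)$, we must calculate

$$E_0\left\{(1 + \phi)^{2X}(1 - \phi)^{2(k-X)}\right\} = \sum_{x=0}^{k} 2^{-k}\binom{k}{x}\left[(1 + \phi)^{2x}(1 - \phi)^{2(k-x)}\right] = (1 + \phi^2)^k.$$

Thus

$$E_0(U^2) = \frac{16\theta_1^2}{\phi^2}\left[(1 + \phi^2)^k - 1\right] = 16\theta_1^2 P_{kk}(\phi^2)$$

and

$$K(0, \theta) = 8\theta_1^2 P_{kk}(\phi^2) + o(\theta_1^2). \tag{4.3}$$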

Applying the additivity of KL Information, we have

LEMMA 1. For Problem 1, the KL Information $K(0, \theta)$ is $8n[\theta_1^2 P_{kk}(\phi^2) + o(\theta_1^2)]$. For
Problem 2, it is $8n[\theta_1^2 \sum_k \lambda_k P_{kk}(\phi^2) + o(\theta_1^2)]$.

The form of $K(0, \theta)$ suggests the importance of $\theta_1$ in our reparametrization. For $k = 2$,
$K(0, \theta) \approx 16\theta_1^2 + 8\theta_2^2 = \tfrac{1}{2}\theta^T J(0)\theta$. However, for $k = 3$,
$K(0, \theta) \approx 24\theta_1^2 + 24\theta_2^2 + 8\theta_2^4/\theta_1^2$, which does not behave well as $\theta_1$ and $\theta_2$
approach zero. The lack of regularity for $k = 3$ is associated with this "poor behavior."
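The second moment identity behind the lemma, $E_0(U^2) = 16\theta_1^2 P_{kk}(\phi^2)$, and the quality of the approximation $K(0, \theta) \approx 8\theta_1^2 P_{kk}(\phi^2)$ can be verified by exact summation over $x = 0, \ldots, k$. This is a minimal check using only the standard library; the function names are hypothetical, and the chosen parameter values are arbitrary small values inside the admissible range.

```python
from math import comb, log

def P(k, x, phi):
    """P_{kx}(phi) = [(1+phi)^x (1-phi)^(k-x) - 1] / phi, for phi != 0."""
    return ((1 + phi) ** x * (1 - phi) ** (k - x) - 1.0) / phi

def Pkk(k, y):
    """P_{kk}(y) = [(1+y)^k - 1] / y, for y != 0."""
    return ((1 + y) ** k - 1.0) / y

def moments_under_null(k, theta1, phi):
    """Exact E_0(U), E_0(U^2) and K(0, theta) for one observation."""
    eu = eu2 = kl = 0.0
    for x in range(k + 1):
        f0 = comb(k, x) * 0.5 ** k          # Bi(k, 1/2) probability
        u = 4.0 * theta1 * P(k, x, phi)     # U = 4 theta1 P_{kX}(phi)
        eu += f0 * u
        eu2 += f0 * u * u
        kl -= f0 * log(1.0 + u)             # K(0,theta) = E_0{-ln(1+U)}
    return eu, eu2, kl

if __name__ == "__main__":
    k, theta1, phi = 3, 0.002, 0.5
    eu, eu2, kl = moments_under_null(k, theta1, phi)
    print(eu, eu2, 16 * theta1 ** 2 * Pkk(k, phi ** 2))
    print(kl, 8 * theta1 ** 2 * Pkk(k, phi ** 2))
```

The first line of output compares the exact and closed form second moments; the second compares the exact KL Information with its leading term, which differ by terms of smaller order in $\theta_1$.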

4.2  The  likelihood  ratio  for  Problem  1. 

Let us assume in Problem 1 that $M_j$ of the n observations have $X_i = j$. Then
$M = (M_0, M_1, \ldots, M_k)^T$ has a multinomial distribution, and from the central limit theorem it
follows that $M_j = n f(j, 0) + \sqrt{n}\,Z_j$ where

$$\mathcal{L}_0(Z) \to N(0, \Sigma) \quad \text{as } n \to \infty \tag{4.4}$$

and

$$\Sigma = \|\sigma_{ij}\| = \|f(i, 0)\delta_{ij} - f(i, 0)f(j, 0)\|, \qquad i, j = 0, 1, \ldots, k \tag{4.5}$$

with $\delta_{ij} = 1$ if $i = j$ and 0 otherwise. The logarithm of the likelihood ratio is
$L = \sup_\theta L(\theta)$ where, for $\theta = O(n^{-1/2})$,

$$\begin{aligned}
L(\theta) &= \sum_j \{n f(j, 0) + \sqrt{n}\,Z_j\}\ln[1 + u(j, \theta)] \\
&= \sum_j \{n f(j, 0)[u(j, \theta) - \tfrac{1}{2}u^2(j, \theta)] + \sqrt{n}\,Z_j u(j, \theta)\} + o_p(1) \\
&= -8n\theta_1^2 P_{kk}(\phi^2) + 4\sqrt{n}\,\theta_1 S(Z, \phi) + o_p(1)
\end{aligned}$$

where

$$S(Z, \phi) = \sum_{j=0}^{k} Z_j P_{kj}(\phi). \tag{4.6}$$

For fixed $\phi$, the maximum of the main terms of $L(\theta)$ is attained at

$$\hat\theta_1(\phi) = S(Z, \phi)/[4\sqrt{n}\,P_{kk}(\phi^2)]. \tag{4.7}$$


However, since $\theta_2 \ge 0$, $\hat\theta_1(\phi)$ is restricted to have the same sign as $\phi$, and the maximum
over this restricted range is $\tfrac{1}{2}T^2(\phi)$ where

$$T(\phi) = \max\left[0, \frac{\operatorname{sgn}(\phi)\,S(Z, \phi)}{\sqrt{P_{kk}(\phi^2)}}\right]. \tag{4.8}$$

Thus

$$2L = \sup_{|\phi| \le 1} T^2(\phi) + o_p(1) \tag{4.9}$$

and we have established

THEOREM 1. For Problem 1, the asymptotic distribution of twice the logarithm of the
likelihood ratio is that of $\sup_{|\phi| \le 1} T^2(\phi)$ where $T(\phi)$ is defined in (4.8) with $\mathcal{L}_0(Z)$ replaced
by $N(0, \Sigma)$.

4.3 The stochastic process $S(Z, \phi)$.

In Problem 1, the limiting distribution of L under $H_0$ is determined by the fact that
$\mathcal{L}_0(Z) \to N(0, \Sigma)$ as $n \to \infty$. By a continuity argument, this limiting distribution may
be obtained by assuming, as we shall in this subsection, that $\mathcal{L}(Z) = N(0, \Sigma)$. Then the
stochastic process $S(Z, \phi)$ is a Gaussian process which may be expressed as a polynomial
of degree $k - 1$ in $\phi$, i.e.,

$$S(Z, \phi) = \sum_{j=0}^{k} Z_j P_{kj}(\phi) = \sum_{i=1}^{k} W_i \phi^{i-1}. \tag{4.10}$$

The distribution of S is described in

PROPOSITION 1. The coefficients $W_i$ are independent normal random variables with
mean 0 and variance $\binom{k}{i}$. The process $S(Z, \phi)$ has mean 0, variance $P_{kk}(\phi^2)$ and
autocorrelation

$$\rho(\phi_1, \phi_2) = P_{kk}(\phi_1\phi_2)/[P_{kk}(\phi_1^2)P_{kk}(\phi_2^2)]^{1/2}. \tag{4.11}$$


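Proof: The $W_i$ are linear functions of the $Z_j$ and hence have a multivariate normal
distribution with mean 0. Also $E(S(Z, \phi)) = 0$. Then

$$E[S(Z, \phi_1)S(Z, \phi_2)] = \sum_{i,j=0}^{k} P_{ki}(\phi_1)P_{kj}(\phi_2)\sigma_{ij} = \sum_{i,j=1}^{k} E(W_i W_j)\,\phi_1^{i-1}\phi_2^{j-1}. \tag{4.12}$$

But

$$\sum_{i,j} P_{ki}(\phi_1)P_{kj}(\phi_2)\sigma_{ij} = \sum_i P_{ki}(\phi_1)P_{ki}(\phi_2)f(i, 0) - \left[\sum_i P_{ki}(\phi_1)f(i, 0)\right]\left[\sum_j P_{kj}(\phi_2)f(j, 0)\right],$$

$$\sum_i P_{ki}(\phi)f(i, 0) = \sum_i 2^{-k}\binom{k}{i}\left[\frac{(1 + \phi)^{i}(1 - \phi)^{k-i} - 1}{\phi}\right] = 0,$$

$$\begin{aligned}
\sum_i P_{ki}(\phi_1)P_{ki}(\phi_2)f(i, 0) &= \sum_i 2^{-k}\binom{k}{i}\frac{[(1 + \phi_1)^{i}(1 - \phi_1)^{k-i} - 1][(1 + \phi_2)^{i}(1 - \phi_2)^{k-i} - 1]}{\phi_1\phi_2} \\
&= \frac{1}{\phi_1\phi_2}\left\{\left[\frac{(1 + \phi_1)(1 + \phi_2) + (1 - \phi_1)(1 - \phi_2)}{2}\right]^k - 1\right\} \\
&= P_{kk}(\phi_1\phi_2).
\end{aligned}$$

The representation (4.11) for the autocorrelation follows. Equating coefficients in (4.12),
it follows that $E(W_i W_j) = 0$ if $i \ne j$ and $E(W_i^2) = \binom{k}{i}$.

One consequence of Proposition 1 is that the stochastic process $S(Z, \phi)/\sqrt{P_{kk}(\phi^2)}$,
which is so important in $T(\phi)$, is (asymptotically) a Gaussian process with mean 0 and
variance 1 for each value of $\phi$.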
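The polynomial representation makes the limiting distribution easy to simulate: draw independent coefficients $W_i$ with variance $\binom{k}{i}$, evaluate the normalized process on a grid of $\phi$ values, and take the largest squared positive part. The sketch below is one possible implementation, assuming NumPy; the grid resolution, the function name `simulate_sup_T2`, and the use of a grid in place of the exact supremum are all choices of this illustration.

```python
import numpy as np
from math import comb

def simulate_sup_T2(k, reps=10000, one_sided=False, grid=400, seed=0):
    """Simulate sup T^2(phi) over a grid of phi in [-1, 1].

    W_i ~ N(0, C(k, i)) independently, S(phi) = sum_i W_i phi^(i-1),
    T(phi) = max(0, sgn(phi) S(phi) / sqrt(P_kk(phi^2))).
    one_sided=True restricts to phi < 0 (p < 1/2, the one sided case).
    """
    rng = np.random.default_rng(seed)
    var = np.array([comb(k, i) for i in range(1, k + 1)], dtype=float)
    W = rng.normal(size=(reps, k)) * np.sqrt(var)
    # An even number of grid points keeps phi = 0 (where sgn = 0) off the grid.
    phi = np.linspace(-1.0, 1.0, grid)
    if one_sided:
        phi = phi[phi < 0]
    powers = phi[None, :] ** np.arange(k)[:, None]   # row i-1 holds phi^(i-1)
    S = W @ powers                                   # shape (reps, len(phi))
    Pkk = (var[:, None] * powers ** 2).sum(axis=0)   # sum_i C(k,i) phi^(2(i-1))
    T = np.maximum(0.0, np.sign(phi) * S / np.sqrt(Pkk))
    return (T ** 2).max(axis=1)

if __name__ == "__main__":
    for k in (2, 5, 10):
        vals = simulate_sup_T2(k, one_sided=True)
        print(k, np.quantile(vals, [0.5, 0.9, 0.95, 0.99]))
```

For $k = 1$ the process is constant in $\phi$ apart from its sign, so the two sided supremum reduces to $W_1^2$, a chi-square variable with one degree of freedom; this gives a simple check of the code.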
5. Generalization, extensions and comments.

In this section we point out that the results for Problem 1 extend easily to Problems 2
and 3. These results are associated with the behavior of the maximum likelihood estimates
of the parameters of the model, which resembles that of the easily derived moment method
estimates. A geometric interpretation of these results is presented.

Problem  4  may  be  treated  along  similar  lines  requiring  a  more  complex  analysis  which 
will  not  be  presented  here. 

Results  are  presented  for  the  noncentral  case  where  the  hypothesis  is  almost  true  and 
for  the  restricted  problem  where  the  alternative  hypothesis  restricts  p  to  be  substantially 
removed  from  0.5.  The  latter  problem  is  relevant  for  the  genetic  application  where  we 
expect  a  strong  linkage,  if  any,  between  the  marker  and  one  of  the  loci  involved. 

Finally, it is pointed out that the logarithm of the likelihood ratio approaches infinity as
$k \to \infty$. The rate of growth is very slow, of order of magnitude $(\log k)^{1/2}$.


5.1  Problems  2  and  3. 

A straightforward extension of the argument for Problem 1 yields

THEOREM 2. If $n_k/n \to \lambda_k$ as $n \to \infty$ with $\sum_k \lambda_k = 1$, the limiting distribution
of 2L under $H_0$ in Problem 2 is $\mathcal{L}\{\sup_{|\phi| \le 1} T^2(\phi)\}$ where

$$T(\phi) = \max\left[0, \frac{\operatorname{sgn}(\phi)\sum_k \lambda_k^{1/2}\sum_{i=1}^{k} W_{ki}\,\phi^{i-1}}{\sqrt{P(\phi^2)}}\right] \tag{5.1}$$

and $P(\phi^2) = \sum_k \lambda_k P_{kk}(\phi^2)$ and the $W_{ki}$ are independent $N(0, \binom{k}{i})$ random variables.

For Problem 3, the only change in the results for Problems 1 and 2 is that the supremum
of $T^2(\phi)$ should be taken over the domain $-1 \le \phi \le 0$ which corresponds to
$0 \le p \le 0.5$.


Problem 4 is more complicated to handle because it involves 3 parameters. We will
not elaborate on it here, but it is subject to a similar analysis where the first stage of the
maximization of the likelihood is with $p_1 - p_2$ kept fixed in place of $\phi$.

5.2 Maximum likelihood and moment method estimates.

Returning to Problem 1, the results are associated with an implicit characterization
of the MLE. Here $\hat\phi$ is the value of $\phi$ which maximizes $T^2(\phi)$,

$$\hat\theta_1 = \frac{1}{4\sqrt{n}}\,\frac{S(Z, \hat\phi)}{P_{kk}(\hat\phi^2)} + o_p(n^{-1/2})$$

and

$$\sqrt{n}\,\hat\theta_1 = \frac{\operatorname{sgn}(\hat\phi)\,T(\hat\phi)}{4\sqrt{P_{kk}(\hat\phi^2)}} + o_p(1).$$

While $\hat\theta_1 = O_p(n^{-1/2})$, the estimate $\hat\phi$ varies between -1 and +1. This is not especially
surprising since $\phi$ is not identified when $H_0$ holds.

Appendix 2 presents a derivation of the asymptotic properties of the unrestricted
moment method estimator $\hat\theta_u$ of $\theta$. This derivation is relatively simple and the results
were useful originally in clarifying the situation in Problem 1. It is seen that
$\mathcal{L}_0(\sqrt{n}\,\hat\theta_u) \to N(0, \Sigma_u)$ where $\Sigma_u$ is a diagonal matrix with diagonal entries $(16k)^{-1}$ and
$(8k(k-1))^{-1}$. It follows that $\hat\phi_u = \hat\theta_{2u}/\hat\theta_{1u}$ has a limiting Cauchy distribution.

There does not seem to be a standard convention for modifying $\hat\theta_u$ to $\hat\theta$ in order
to conform with the restriction. It seems natural to use some sort of projection, in which
case $\hat\phi$ would behave like the mixture of a truncated Cauchy with a probability of 0.5 at
$\phi = 0$.


5.3  A  geometric  interpretation. 

Appendix 1 and Section 4 present the results for $k = 2$ in Problem 1 in different
ways. A geometric interpretation of the results of Section 4 relates these. The expression

$$T_1(\phi) = \operatorname{sgn}(\phi)\frac{S(Z, \phi)}{\sqrt{P_{kk}(\phi^2)}} = \operatorname{sgn}(\phi)\frac{\sum_{i=1}^{k} W_i\,\phi^{i-1}}{\sqrt{P_{kk}(\phi^2)}}$$

can also be written

$$T_1(\phi) = \sum_{i=1}^{k} Y_i\,g_i(\phi)\Big/\sqrt{g^T g} = Y^T g\Big/\sqrt{g^T g}$$

where the components of g are

$$g_i(\phi) = \operatorname{sgn}(\phi)\binom{k}{i}^{1/2}\phi^{i-1}, \qquad i = 1, 2, \ldots, k$$

and $\mathcal{L}(Y) = N(0, I)$. Thus if $T_1(\phi)$ is positive, it may be interpreted as the length of
the projection of Y onto the ray from the origin through g.

Consider the "cone" made up of all of these rays as $\phi$ varies from -1 to +1. This cone
is a two dimensional surface in k dimensional space. For $k = 2$, that surface is simply
the angle between the horizontal axis and the line with slope $1/\sqrt{2}$ and the reflection
of that angle about the vertical axis. Our asymptotic expression for $\sup_\phi T^2(\phi)$ is the squared
length of the projection of Y onto this surface. The MLE $\hat\phi$ corresponds to the ray on
which the projection falls and $\sqrt{n}\,\hat\theta_1$ is the ratio of the length of the projection to four
times the length of $g(\hat\phi)$.


5.4  The  noncentral  case. 

In  regular  problems,  the  limiting  chi-square  distribution  of  the  Wilks  result  becomes 
a  noncentral  chi-square  when  the  hypothesis  is  false  but  the  true  value  of  the  parameter 
is  close  to  the  set  of  parameter  values  under  which  the  hypothesis  is  true.  In  Appendix  3 
we  derive  the  following  analogous  result  for  Problem  1. 


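THEOREM 3. In Problem 1 let the true value $\theta_{0n}$ of the parameter be $n^{-1/2}(\theta_1^*, \theta_2^*)^T$
for fixed $\theta_1^*$ and $\theta_2^* = \phi^*\theta_1^*$. Then

$$\mathcal{L}_{\theta_{0n}}(2L) \to \mathcal{L}\left\{\sup_{|\phi| \le 1} T^{*2}(\phi)\right\}$$

where

$$T^*(\phi) = \max\left[0, \frac{\operatorname{sgn}(\phi)\,S^*(\phi)}{\sqrt{P_{kk}(\phi^2)}}\right], \qquad S^*(\phi) = \sum_{i=1}^{k} W_i^*\,\phi^{i-1},$$

and the $W_i^*$ are independent with

$$\mathcal{L}(W_i^*) = N\left(4\theta_1^*\binom{k}{i}\phi^{*\,i-1},\ \binom{k}{i}\right).$$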

When the true value of $\theta$ is $\theta_0$, which is substantially different from 0, we have
what may be called the large deviation case. Then the distribution of the logarithm of the
likelihood ratio behaves like $N(nK(\theta_0, 0), nV(\theta_0))$ where

$$K(\theta_0, 0) = E_{\theta_0}\{\ln[f(X, \theta_0)/f(X, 0)]\}$$

and

$$V(\theta_0) = \operatorname{Var}_{\theta_0}\{\ln[f(X, \theta_0)/f(X, 0)]\}.$$

5.5  The  restricted  problem. 

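If we restrict p to be in the interval $0 \le p \le p^* < 0.5$, then $-1 \le \phi \le \phi^* =
2p^* - 1 < 0$, and we have

$$\mathcal{L}_0(2L) \to \mathcal{L}\left\{\sup_{-1 \le \phi \le \phi^*} T^2(\phi)\right\}.$$

For the special case $k = 2$, the analysis of Appendix 1 is easily extended to yield the
closed form limiting distribution,

$$\mathcal{L}_0(2L) \to \tfrac{1}{2}\mathcal{L}(\chi^2_1) + \lambda(\phi^*)\,\mathcal{L}(\chi^2_2) + (\tfrac{1}{2} - \lambda(\phi^*))\,\mathcal{L}(0)$$

where

$$\lambda(\phi^*) = \left[\arctan(1/\sqrt{2}) - \arctan(-\phi^*/\sqrt{2})\right]/2\pi$$

is the angle, as a fraction of $2\pi$, between $\phi = \phi^*$ and $\phi = -1$ after a normalizing
transformation of the $(\theta_1, \theta_2)$ space.

5.6 Limiting behavior as $k \to \infty$.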

In the limit as $n \to \infty$, we deal with the supremum of a Gaussian process with mean
0, variance 1 and the autocorrelation function $\rho(\phi_1, \phi_2)$ in (4.11). We shall show that
as k gets large neighboring values of the stochastic process become almost independent
and hence the supremum resembles that of many i.i.d. $N(0, 1)$ random variables and
approaches infinity. As we shall show, this approach is of the order of $(\log k)^{1/2}$ which
grows very slowly with k.

To demonstrate the almost independence, consider $\phi_1$ and $\phi_2 = \phi_1 + \delta$ where $\phi_1$
is bounded away from 0, $k\delta^2 \to \infty$ while $k\delta^3 \to 0$ (for example, $\delta = k^{-a}$ with
$1/3 < a < 1/2$). Then a careful expansion yields

$$\ln \rho^2(\phi_1, \phi_2) = -\frac{k\delta^2}{(1 + \phi_1^2)^2} + o(1) \to -\infty$$

and $\rho(\phi_1, \phi_1 + \delta) \to 0$.

This analysis suggests the transformation $u = -k^{1/2}\phi$ for $-1 \le \phi \le 0$, so that as
$k \to \infty$

$$\rho(\phi_1, \phi_2) = \frac{(1 + \phi_1\phi_2)^k - 1}{\{[(1 + \phi_1^2)^k - 1][(1 + \phi_2^2)^k - 1]\}^{1/2}} \to \frac{e^{u_1 u_2} - 1}{[(e^{u_1^2} - 1)(e^{u_2^2} - 1)]^{1/2}} = \rho_1(u_1, u_2)$$

where

$$\rho_1(u_1 + a, u_2 + a) \to \rho_2(u_1 - u_2) = e^{-(u_1 - u_2)^2/2} \tag{5.2}$$

as $a \to \infty$. Our stochastic process converges in law to one with autocorrelation function
$\rho_1$. Here $\rho_2$ is the autocorrelation function of a stationary Gaussian stochastic process
for which the asymptotic properties of the supremum are well known. In Appendix 4
we show that this supremum, which, over the interval $0 \le u \le k^{1/2}$, is $(\log k)^{1/2} +
O_p(1)(\log k)^{-1/2}$, dominates stochastically that of our limiting process determined by $\rho_1$.
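The two limits above can be illustrated numerically. The sketch below evaluates the autocorrelation (4.11) for a large k under the scaling $\phi = -u/\sqrt{k}$ and compares it with $\rho_1$, then shifts both arguments of $\rho_1$ by a constant a to compare with the stationary form $\rho_2$. It uses only the standard library; the function names and the particular values of k, u and a are arbitrary choices for illustration.

```python
import math

def Pkk(k, y):
    """P_{kk}(y) = [(1+y)^k - 1] / y, for y != 0."""
    return ((1.0 + y) ** k - 1.0) / y

def rho(k, phi1, phi2):
    """Autocorrelation (4.11) of the normalized process, phi1, phi2 != 0."""
    return Pkk(k, phi1 * phi2) / math.sqrt(Pkk(k, phi1 ** 2) * Pkk(k, phi2 ** 2))

def rho1(u1, u2):
    """Limit of rho as k -> infinity with phi = -u / sqrt(k), u1, u2 > 0."""
    return (math.exp(u1 * u2) - 1.0) / math.sqrt(
        (math.exp(u1 ** 2) - 1.0) * (math.exp(u2 ** 2) - 1.0))

def rho2(d):
    """Stationary limit of rho1(u1 + a, u2 + a) as a -> infinity, d = u1 - u2."""
    return math.exp(-0.5 * d * d)

if __name__ == "__main__":
    k, u1, u2 = 10 ** 6, 1.0, 1.5
    print(rho(k, -u1 / math.sqrt(k), -u2 / math.sqrt(k)), rho1(u1, u2))
    a = 5.0
    print(rho1(u1 + a, u2 + a), rho2(u1 - u2))
```

For very large k with $\phi$ not scaled down, $(1 + \phi^2)^k$ can overflow a floating point number, so the direct formula is only used here in the regime $\phi = O(k^{-1/2})$.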

Two points are worth noting. First, with large probability the supremum for the
process corresponding to $\rho_1$ takes place for large u, and is, in the limit, stochastically
equal to that for $\rho_2$. Second, our asymptotics involves a double limit. First we let $n \to \infty$
and then $k \to \infty$. In practice both k and n are finite, and for large k, very large n
is required for the asymptotic normality to be meaningful for $\phi$ somewhat distant from
0, i.e., p removed from 0.5. We shall not elaborate on this point except to remark that
informal calculations suggest that our approximations through the asymptotic theory lead
to estimates of the quantiles of 2L that are conservative in the following sense. They are
somewhat larger than for finite k and n and their use would suggest larger P values
than the true values.

6.  Simulations 

In this section we compare various asymptotic and finite sample distributions for twice
the logarithm of the likelihood ratio. With a few exceptions for $k = 2$, the asymptotic
distributions are not expressed in simple closed form and hence were calculated by
simulation. We present various estimated quantiles for each distribution. The calculations were
based on 10,000 simulations each, and so the standard deviation of the estimate $\hat{x}_q$ of the
quantile $x_q$ would be $(0.01)[q(1 - q)/f^2(x_q)]^{1/2}$. Since the density $f(x_q)$ is ordinarily
not known, an interested user of the tables could crudely estimate it from the table. Since
the finite sample distributions are actually discrete, there is some indication of granularity
for small n and q close to one. Table 1 does not reveal this granularity and its effect on
the standard deviation of our estimated quantiles, for which coarse approximations can be
inferred from the limited data presented here. From a sample of 10,000, only crude results
can be expected for estimating the 0.999 quantile based on the 10 largest observations, an
estimate which usually has a relatively large standard deviation.
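The standard deviation formula for an estimated quantile is easy to apply once a density value is available. As an illustration only, the sketch below uses the chi-square density with one degree of freedom, for which the density is known in closed form; for the distributions in Table 1 the density would have to be estimated, as noted above. The function names are hypothetical.

```python
import math

def quantile_sd(q, density_at_quantile, n_sim=10_000):
    """Approximate sd of an estimated q-quantile: sqrt(q(1-q)/n_sim) / f(x_q)."""
    return math.sqrt(q * (1.0 - q) / n_sim) / density_at_quantile

def chi2_1_pdf(x):
    """Density of the chi-square distribution with one degree of freedom."""
    return math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi * x)

if __name__ == "__main__":
    # 3.841 is the 0.95 quantile of chi-square with one degree of freedom.
    print(round(quantile_sd(0.95, chi2_1_pdf(3.841)), 4))
```

With 10,000 simulations the factor $\sqrt{q(1-q)/n}$ equals $(0.01)[q(1-q)]^{1/2}$, which is why that quantity alone is enough to convert a density estimate into a standard deviation.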

Table 1 presents the quantiles and estimated quantiles in the following order. First
the asymptotic case ($n = \infty$) for the one sided problem ($s = 1$, $0 \le p \le 0.5$) is tabulated.
Then a couple of cases for the two sided problem ($s = 2$) are listed. Thereafter all entries
correspond to the one sided case. The finite sample results are listed. A few examples of
the mixed case with two values of k are given. In the asymptotic version the value of n
is replaced by the ratio $\lambda_1/\lambda_2$. In the finite sample version $n_1, n_2$ is used for n and in
both cases $k_1 : k_2$ is given for k.

The noncentral asymptotic case lists the value of p and $\theta_1^* = \sqrt{n}\,\theta_1$. For the finite
sample versions, the corresponding values of $\alpha$ are also listed. Then the restricted case
is tabulated in the asymptotic and finite sample size case. The table concludes with the
quantiles for chi-square with one and two degrees of freedom and $(0.01)[q(1 - q)]^{1/2}$ to
help compute standard deviations.

The  entries  in  Table  1  were  culled  from  a  more  extensive  list  of  simulations  to  provide 
the  reader  with  some  ability  to  make  meaningful  comparisons  without  the  benefit  of  sensory 
overload. 

The asymptotic results of Table 1 show that 2L grows slowly with k. For $k = 40$, 2L
for the one sided problem is still stochastically smaller than chi-square with 2 degrees of
freedom. For the two sided problem 2L tends to be a little larger, which is to be expected.
The  finite  sample  estimates  give  lower  quantiles  than  the  asymptotic  values.  Thus  the 
asymptotic  values  are  conservative  in  the  sense  that  they  would  lead  to  overestimating  the 
P  values  of  the  test.  The  values  of  k  and  n  in  our  simulations  are  not  sufficiently  large 
to  exhibit  the  asymptotic  behavior  for  large  k  except  in  the  crudest  fashion. 

Except for the unreliable 0.999 estimated quantile, the mixed distribution is remarkably
insensitive to the mixture proportions, yielding quantiles close to those of the unmixed
larger k values.

For the noncentral asymptotic distributions a difference in the value of p seems
to lead to an effect which is largely that of a translation. For $k = 2$, the asymptotic
distribution is a good approximation for sample sizes of $n = 20$ and 40. However, for
$k = 10$, the asymptotic distribution has 2L much larger stochastically than it should
be for the sample sizes $n = 20$ and 40. This effect, which suggests an unduly optimistic
estimate of the power of the test, is due to the fact that the central limit theorem takes
effect rather slowly when k is substantial and p is far from 0.5. Then $-\phi$ is relatively
large and $S(Z, \phi)$ tends to have a large kurtosis. Presumably this approximation would be
good for very large sample sizes, much larger than appropriate for our genetic application
when $k = 10$.

Finally, for the restricted problem and the values of p, the asymptotic distribution
(under the hypothesis) seems to be, as expected, stochastically smaller than that for the
unrestricted case. However, this reduction is not large and becomes a little larger as the
sample size diminishes.

Table 2 presents $K(\theta_0, \theta)$ and $[V(\theta_0)]^{1/2}$, which can be used with a central limit
theorem approximation to estimate the quantiles for the finite sample version of the noncentral
distribution. In fact the highly right-skewed distribution of 2L is such that the normal
approximation is not very good in the right tails. In the more crucial left tails, it is much
better, but still leads to conservative estimates of the power function by overestimating the
error probability of tests for the noncentral case. This overestimate is due to the natural
truncation of the real distribution of 2L at 0.

APPENDIX  1.  Details  on  the  case  k  =  2. 

If we introduce the transformation $\theta_1' = \sqrt{32}\,\theta_1$ and $\theta_2' = 4\theta_2$, then the unrestricted
MLE $\hat\theta_u'$ of $\theta'$ satisfies

$$\mathcal{L}_0(\sqrt{n}\,\hat\theta_u') \to N(0, I),$$

but the set of possible values of $\theta'$ is the union of the region between $\theta_2' = \theta_1'/\sqrt{2}$
and $\theta_2' = \theta_1'^{2}/2$ for $0 < \theta_1' < \sqrt{2}$, and its reflection about the $\theta_2'$ axis. In the
neighborhood of $\theta' = 0$, this set is approximated by the region between $\theta_2' = \theta_1'/\sqrt{2}$
and $\theta_2' = 0$ for $\theta_1' > 0$, and its reflection. Then $2L$ behaves like the difference between
the squared distance of $\sqrt{n}\,\hat\theta_u'$ to the origin and to the above set. When $\sqrt{n}\,\hat\theta_{2u}' < 0$,
this difference is $(\sqrt{n}\,\hat\theta_{1u}')^2$ and contributes $\frac{1}{2}\mathcal{L}(\chi_1^2)$ to the asymptotic distribution of
$\mathcal{L}_0(2L)$. Here $\hat\theta_2' \approx 0$ and $\hat\theta_1' \approx \hat\theta_{1u}'$. When $\hat\theta_u'$ is in the restricted region, $\hat\theta' = \hat\theta_u'$
and the squared distance to the origin is $n(\hat\theta_{1u}'^{2} + \hat\theta_{2u}'^{2})$; this contributes $\chi_2^2$ with probability
determined by the proportion of the plane covered by the restricted region, i.e., $2A$ where
$A = (2\pi)^{-1}\arctan(1/\sqrt{2})$. In the remaining region, the difference is the squared
length of the projection of $\sqrt{n}\,\hat\theta_u'$ onto the nearer of the two lines $\theta_2' = \pm\theta_1'/\sqrt{2}$. Since
the projections $Y_1$ and $Y_2$ of $\sqrt{n}\,\hat\theta_u'$ onto $\theta_2' = \theta_1'/\sqrt{2}$ and onto the direction perpendicular to it are
independent $N(0, 1)$ random variables, we can represent this contribution by

$$(\tfrac{1}{2} - 2A)\,\mathcal{L}(Y_1^2 \mid 0 < Y_2 < \sqrt{2}\,Y_1).$$

Equations (3.3) - (3.5) follow readily. For Problem 3, a similar but somewhat simpler
analysis yields Equation (3.6) after it is noted that for a proportion $(1/2 - A)$ of the
region, the closest point from $\sqrt{n}\,\hat\theta_u'$ to the restricted region is at the origin.
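
The geometric description above lends itself to a quick Monte Carlo check. The following is a minimal sketch, not the simulation procedure used for Table 1: it assumes the local approximation of the parameter set by the double wedge $0 \le \theta_2' \le |\theta_1'|/\sqrt{2}$, draws standard bivariate normal points in place of $\sqrt{n}\,\hat\theta_u'$, and evaluates the difference of squared distances. The function names `lr_statistic`, `in_cone`, `simulate` and `quantile` are illustrative choices of ours.

```python
import math
import random

SQRT2 = math.sqrt(2.0)
SQRT3 = math.sqrt(3.0)
# A = (2*pi)^(-1) * arctan(1/sqrt(2)), as in the text
A = math.atan(1.0 / SQRT2) / (2.0 * math.pi)


def in_cone(x, y):
    """True when (x, y) lies in the double wedge 0 <= y <= |x|/sqrt(2)."""
    return 0.0 <= y <= abs(x) / SQRT2


def lr_statistic(x, y):
    """Squared distance of (x, y) to the origin minus its squared
    distance to the double wedge (the limiting form of 2L)."""
    ax = abs(x)  # the wedge is symmetric about the vertical axis
    if y < 0.0:
        return ax * ax  # nearest point of the wedge is (ax, 0)
    if y <= ax / SQRT2:
        return ax * ax + y * y  # the point is already in the wedge
    # otherwise project onto the boundary ray y = x/sqrt(2), x >= 0
    t = (SQRT2 * ax + y) / SQRT3
    return t * t


def simulate(n_draws, seed=0):
    """Sorted Monte Carlo draws from the limiting distribution."""
    rng = random.Random(seed)
    return sorted(
        lr_statistic(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        for _ in range(n_draws)
    )


def quantile(sorted_values, q):
    """Crude empirical quantile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


if __name__ == "__main__":
    draws = simulate(200_000, seed=1)
    for q in (0.25, 0.50, 0.75, 0.90, 0.95, 0.99):
        print(q, round(quantile(draws, q), 2))
```

The empirical quantiles can be set beside the asymptotic k = 2 entries of Table 1, bearing in mind Monte Carlo error; the fraction of draws falling inside the wedge should be close to $2A$.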

APPENDIX  2.  Estimation  of  the  parameters  by  the  method  of  moments. 

We  estimate  the  parameters  of  Problem  1  by  matching  the  first  two  sample  moments 
with  the  theoretical  moments.  We  calculate 


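$$\mu_1 = E(X) = \alpha k p + (1 - \alpha)k/2 = k(2\theta_1 + 1/2)$$

$$\mu_2 = E(X^2) = k[\alpha p + (1 - \alpha)/2] + k(k - 1)[\alpha p^2 + (1 - \alpha)/4]
      = k(2\theta_1 + 1/2) + k(k - 1)(\theta_1\phi + 2\theta_1 + 1/4)$$

$$\theta_1 = \frac{1}{2}\left[\frac{\mu_1}{k} - \frac{1}{2}\right], \qquad
\theta_2 = \left[\frac{\mu_2 - \mu_1}{k(k - 1)} - \frac{\mu_1}{k} + \frac{1}{4}\right].$$

By the Central Limit Theorem

$$\bar{X} = \frac{1}{n}\sum X_i = \mu_1 + \frac{Z_1}{\sqrt{n}}, \qquad
\frac{1}{n}\sum X_i^2 = \mu_2 + \frac{Z_2}{\sqrt{n}},$$

where $\mathcal{L}(Z_1, Z_2) \to N(0, \Sigma^*)$, $\Sigma^* = \|\sigma^*_{ij}\|$, and $\sigma^*_{ij} = \mu_{i+j} - \mu_i\mu_j$ with $\mu_i = E(X^i)$.
Under the hypothesis $H_0$, $\mathcal{L}(X) = Bi(k, 1/2)$, and $\mu_1 = k/2$, $\mu_2 = (k + k^2)/4$, $\mu_3 =
k^2(k + 3)/8$ and $\mu_4 = k(k^3 + 6k^2 + 3k - 2)/16$. Then $\sigma^*_{11} = k/4$, $\sigma^*_{12} = k^2/4$, and
$\sigma^*_{22} = k(k + 1)(2k - 1)/8$. Applying the method of moments and ignoring the restrictions
on $\theta_1$ and $\theta_2$ yields the estimates

$$\hat\theta_{1u} = \frac{1}{2}\left[\frac{\bar{X}}{k} - \frac{1}{2}\right] = \frac{1}{2k}\,\frac{Z_1}{n^{1/2}}$$

and

$$\hat\theta_{2u} = \left[\frac{n^{-1}\sum X_i^2 - \bar{X}}{k(k - 1)} - \frac{\bar{X}}{k} + \frac{1}{4}\right]
= \frac{1}{k(k - 1)}\,\frac{Z_2 - kZ_1}{n^{1/2}},$$

where $E\{Z_1(Z_2 - kZ_1)\} = \sigma^*_{12} - k\sigma^*_{11} = 0$ and hence

$$\mathcal{L}(\sqrt{n}\,\hat\theta_u) \to N(0, \Sigma_u)$$

with $\Sigma_u$ a diagonal matrix with diagonal elements $\sigma^*_{11}/4k^2 = (16k)^{-1}$ and $(\sigma^*_{22} -
k\sigma^*_{12})/k^2(k - 1)^2 = [8k(k - 1)]^{-1}$. Thus $\hat\phi_u = \hat\theta_{2u}/\hat\theta_{1u}$ has a limiting Cauchy distribution
with scale parameter $a$ where $a^2 = 2(k - 1)^{-1}$.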
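
The moment identities used in this appendix are easy to confirm by exact arithmetic. The sketch below is only a check, under the assumption that $X$ is $Bi(k, 1/2)$ as under the hypothesis; the helper names `raw_moment`, `sigma_star` and `sigma_u_diagonal` are ours and do not appear in the paper.

```python
from fractions import Fraction
from math import comb


def raw_moment(k, r):
    """Exact r-th raw moment E(X^r) of X ~ Bi(k, 1/2)."""
    return sum(Fraction(comb(k, j), 2 ** k) * j ** r for j in range(k + 1))


def sigma_star(k):
    """Entries sigma*_11, sigma*_12, sigma*_22, with
    sigma*_ij = mu_{i+j} - mu_i * mu_j."""
    mu = {r: raw_moment(k, r) for r in range(1, 5)}
    s11 = mu[2] - mu[1] ** 2
    s12 = mu[3] - mu[1] * mu[2]
    s22 = mu[4] - mu[2] ** 2
    return s11, s12, s22


def sigma_u_diagonal(k):
    """Limiting variances of sqrt(n)*theta_1u and sqrt(n)*theta_2u,
    from theta_1u = Z1/(2k sqrt(n)) and
    theta_2u = (Z2 - k Z1)/(k(k-1) sqrt(n))."""
    s11, s12, s22 = sigma_star(k)
    var1 = s11 / (4 * k * k)
    var2 = (s22 - 2 * k * s12 + k * k * s11) / (k * k * (k - 1) ** 2)
    return var1, var2


if __name__ == "__main__":
    for k in (2, 6, 10, 40):
        var1, var2 = sigma_u_diagonal(k)
        # squared Cauchy scale a^2 = var2 / var1, expected to be 2/(k-1)
        print(k, var1, var2, var2 / var1)
```

Since $\sigma^*_{12} - k\sigma^*_{11} = 0$, the two estimates are asymptotically uncorrelated, which is what makes the ratio of the two limiting normals a Cauchy variable with squared scale equal to the ratio of the variances.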


APPENDIX  3.  The  noncentral  case. 

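Let the true value of $\theta$ be $\theta_{0n} = n^{-1/2}(\theta_1^*, \theta_2^*)$ where $\theta_2^* = \phi^*\theta_1^*$. We recapitulate
the derivation of Section 4 for this case. We have, for the number of observations with $X = j$, the representation

$$n f(j, \theta_{0n}) + \sqrt{n}\,Z_j,$$

where once again, as in (4.4), $\mathcal{L}_{\theta_{0n}}(Z) \to N(0, \Sigma)$ as $n \to \infty$. Then, for $\theta = O(n^{-1/2})$,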


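$$L(\theta) = \sum_j \left(n f(j, \theta_{0n}) + \sqrt{n}\,Z_j\right)\ln\{1 + u(j, \theta)\}$$

$$\qquad = \sum_j \left\{n f(j, \theta_{0n})\left[u(j, \theta) - \tfrac{1}{2}u^2(j, \theta)\right] + \sqrt{n}\,Z_j\,u(j, \theta)\right\} + o_p(1).$$

The first part of the above expression being summed is

$$A_j = n 2^{-k}\binom{k}{j}\{1 + u(j, \theta_{0n})\}\{u(j, \theta) - \tfrac{1}{2}u^2(j, \theta)\},$$

the sum of which can be expressed in terms of sums of the form

$$a_{rs} = \sum_j 2^{-k}\binom{k}{j}(1 + \phi^*)^{rj}(1 + \phi)^{sj}(1 - \phi^*)^{r(k-j)}(1 - \phi)^{s(k-j)}
= \{[(1 + \phi^*)^r(1 + \phi)^s + (1 - \phi^*)^r(1 - \phi)^s]/2\}^k$$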


where $r = 0$ or $1$ and $s = 0, 1,$ or $2$. The two terms of $A_j$ without $\theta^*$ were already
summed in the previous derivation and contribute $-8n\theta_1^2 P_{kk}(\phi^2)$. The remaining terms
yield

$$16\sqrt{n}\,\frac{\theta_1^*\theta_1}{\phi^*\phi}\,[a_{11} - a_{10} - a_{01} + a_{00}]
- 32\sqrt{n}\,\frac{\theta_1^*\theta_1^2}{\phi^*\phi^2}\,[(a_{12} - a_{02}) - 2(a_{11} - a_{01}) + (a_{10} - a_{00})].$$


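But $a_{00} = a_{10} = a_{01} = 1$ and with some calculation we have

$$\sqrt{n}\,\{16\theta_1^*\theta_1 P_{kk}(\phi^*\phi) - 32\theta_1^*\theta_1^2 Q_k(\phi^*, \phi)\}$$

where

$$Q_k(\phi^*, \phi) = \{[(1 + \phi^2 + 2\phi^*\phi)^k - (1 + \phi^2)^k] - 2[(1 + \phi^*\phi)^k - 1]\}/(\phi^*\phi^2).$$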
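
The reduction of the sums to closed form can be checked with exact rational arithmetic. The sketch below assumes the sums have the binomial form $a_{rs} = \sum_j 2^{-k}\binom{k}{j}(1+\phi^*)^{rj}(1+\phi)^{sj}(1-\phi^*)^{r(k-j)}(1-\phi)^{s(k-j)}$, which is our reading of the derivation; the function names `a_sum` and `a_closed` are ours.

```python
from fractions import Fraction
from math import comb


def a_sum(k, r, s, phi_star, phi):
    """a_rs evaluated term by term over j = 0, ..., k."""
    total = Fraction(0)
    for j in range(k + 1):
        term = Fraction(comb(k, j), 2 ** k)
        term *= (1 + phi_star) ** (r * j) * (1 + phi) ** (s * j)
        term *= (1 - phi_star) ** (r * (k - j)) * (1 - phi) ** (s * (k - j))
        total += term
    return total


def a_closed(k, r, s, phi_star, phi):
    """Closed form obtained from the binomial theorem."""
    half_sum = ((1 + phi_star) ** r * (1 + phi) ** s
                + (1 - phi_star) ** r * (1 - phi) ** s) / 2
    return half_sum ** k


if __name__ == "__main__":
    phi_star, phi = Fraction(-2, 5), Fraction(-7, 10)
    for r in (0, 1):
        for s in (0, 1, 2):
            print(r, s, a_sum(6, r, s, phi_star, phi) == a_closed(6, r, s, phi_star, phi))
```

In particular $a_{11} = (1 + \phi^*\phi)^k$, $a_{02} = (1 + \phi^2)^k$ and $a_{12} = (1 + \phi^2 + 2\phi^*\phi)^k$, which are the powers appearing in $Q_k$.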


For fixed $\phi$ we maximize, with respect to $\theta_1$, subject to $\theta_1\phi \ge 0$, the main term of

$$L(\phi) = -\theta_1^2\{8nP_{kk}(\phi^2) + 32\theta_1^*\sqrt{n}\,Q_k(\phi^*, \phi)\} + 4\theta_1\sqrt{n}\,S^*(Z, \phi) + o_p(1)$$


where 


S-(Z,*)  =  S(Z,*)  + 49; 

=  +41’;  (*)(w = 
and the relation of $W$ to $Z$ is the same as in (4.10) and the limiting distribution of $W^*$
is that of independent normals with

H =  1 V(4«;  (*)*”■''(*)) 

The maximum of the main term of $L(\phi)$ occurs at $\theta_1 = 0$ or at

$$\theta_1 = \frac{S^*(Z, \phi)}{4\sqrt{n}\,P_{kk}(\phi^2) + O(1)}$$

if $S^*(Z, \phi)$ has the same sign as $\phi$. Then


$$2L(\phi) = T^{*2}(\phi) + o_p(1)$$


and  Theorem  2  follows. 


APPENDIX 4. The case of $k \to \infty$.

We shall show that $\rho_1$ and $\rho_2$ defined in Section 5.6 satisfy $\rho_1(u_1, u_2) \ge \rho_2(u_1 - u_2)$.
Hence Slepian's Lemma in Leadbetter et al. (1982, p. 156) implies that the supremum of
the Gaussian process with covariance function $\rho_1$ is stochastically less than that of the
stationary Gaussian process with autocovariance function $\rho_2$. Then, applying Theorem
8.2.7 in Leadbetter et al. (1982, p. 171) for the stationary process with $\lambda_2 = -\rho_2''(0) = 1$,
we have the supremum over the range $0 < u < \sqrt{k}$ to be

$$M = (\log k)^{1/2} + \frac{X - \log(2\pi)}{(\log k)^{1/2}} + o_p\big((\log k)^{-1/2}\big)$$

where

$$P\{X \le x\} = \exp(-e^{-x}).$$
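
To give a feeling for the slow growth in $k$, the leading terms of this expansion can be evaluated numerically. The sketch below is an illustration only: it assumes the standard normalization for the maximum of a stationary Gaussian process with $\lambda_2 = 1$ over an interval of length $\sqrt{k}$, drops the $o_p$ remainder, and is meaningful only for large $k$. The function names `gumbel_quantile` and `sup_quantile_approx` are ours.

```python
import math


def gumbel_quantile(q):
    """Solve exp(-exp(-x)) = q for x, with 0 < q < 1."""
    return -math.log(-math.log(q))


def sup_quantile_approx(k, q):
    """Leading-order q-quantile of the supremum M of the stationary
    comparison process over 0 < u < sqrt(k):
    (log k)^(1/2) + (x_q - log(2*pi)) / (log k)^(1/2)."""
    root = math.sqrt(math.log(k))
    return root + (gumbel_quantile(q) - math.log(2.0 * math.pi)) / root


if __name__ == "__main__":
    for k in (10 ** 2, 10 ** 4, 10 ** 6):
        print(k, round(sup_quantile_approx(k, 0.95), 3))
```

Because the comparison process only bounds the process with covariance $\rho_1$ stochastically from above, these values should be read as rough upper guides rather than as quantiles of the limiting distribution of interest.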


We  conclude  with  a  proof  of 

LEMMA 2. $\rho_1(u_1, u_2) \ge \rho_2(u_1 - u_2)$ for $0 < u_1$, $0 < u_2$.


Proof: First we note that $\rho_1 \ge 0$ and that

$$\rho_1^2(u_1, u_2) = \rho_2^2(u_1 - u_2)\,\frac{[1 - e^{-u_1 u_2}]^2}{[1 - e^{-u_1^2}][1 - e^{-u_2^2}]}.$$

Hence it suffices to prove that

$$(1 - e^{-u_1 u_2})^2 \ge (1 - e^{-u_1^2})(1 - e^{-u_2^2})$$

or

$$\log(1 - e^{-u_1 u_2}) \ge \tfrac{1}{2}\log(1 - e^{-u_1^2}) + \tfrac{1}{2}\log(1 - e^{-u_2^2}).$$

If we let $u = \exp(y)$, it suffices to show that

$$g(y) = \log h(y) = \log[1 - \exp(-\exp(2y))]$$

is a concave function of $y$ or that $g''(y) < 0$. We calculate

$$h'(y) = 2\exp(2y)\exp[-\exp(2y)] > 0$$
$$h''(y) = 2h'(y)[1 - \exp(2y)]$$
$$g'(y) = h'(y)/h(y)$$

and

$$g''(y) = -\left[\frac{h'(y)}{h(y)}\right]^2 + \frac{h''(y)}{h(y)}.$$

If $y \ge 0$, $h''(y) \le 0$ and $g''(y)$ is clearly negative. For $y < 0$, we must show that

$$[h'(y)]^2 > h(y)h''(y)$$

or

$$2\exp(2y)\exp(-\exp(2y)) > 2(1 - \exp(2y))(1 - \exp(-\exp(2y))),$$

that is,

$$\exp(2y) + \exp(-\exp(2y)) > 1.$$

Let $v = \exp(2y)$. Then we need

$$e^{-v} + v - 1 > 0 \quad \text{for } 0 < v < 1.$$

In fact it is well known that the above inequality holds for all $v \ne 0$.
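
A numerical spot check of the two facts used in the proof may be reassuring. The sketch below is not a substitute for the argument: it evaluates the gap in the key inequality and the second derivative $g''$ on a grid, with helper names (`one_minus_exp_neg`, `lemma_gap`, `g_second_derivative`) chosen by us for illustration.

```python
import math


def one_minus_exp_neg(x):
    """1 - exp(-x), computed stably for small x."""
    return -math.expm1(-x)


def lemma_gap(u1, u2):
    """(1 - e^{-u1 u2})^2 - (1 - e^{-u1^2})(1 - e^{-u2^2});
    the lemma requires this to be nonnegative."""
    return (one_minus_exp_neg(u1 * u2) ** 2
            - one_minus_exp_neg(u1 * u1) * one_minus_exp_neg(u2 * u2))


def g_second_derivative(y):
    """g''(y) = h''/h - (h'/h)^2 for h(y) = 1 - exp(-exp(2y))."""
    v = math.exp(2.0 * y)
    h = one_minus_exp_neg(v)
    h1 = 2.0 * v * math.exp(-v)
    h2 = 2.0 * h1 * (1.0 - v)
    return h2 / h - (h1 / h) ** 2


if __name__ == "__main__":
    grid = [0.05 * i for i in range(1, 101)]
    worst = min(lemma_gap(a, b) for a in grid for b in grid)
    print("smallest gap on grid:", worst)
    print("g'' at y = -1, 0, 1:", [g_second_derivative(y) for y in (-1.0, 0.0, 1.0)])
```

Equality in the inequality holds when $u_1 = u_2$, so the smallest gap on the grid is zero up to rounding.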
Table 1.  Quantiles of $\mathcal{L}(2L)$

   0.25   0.50   0.75   0.90   0.95   0.99   .999 |    k      n  sides      p   -θ1*      α

asymptotic
      0    .06    .80    2.2    3.4    6.3   10.7 |    2      ∞      1
      0    .17   1.07    2.5    3.7    6.5   10.8 |    4      ∞      1
      0    .26   1.26    2.9    4.3    7.2   11.2 |    6      ∞      1
    .01    .42   1.56    3.2    4.5    7.4   12.0 |   10      ∞      1
    .30   1.12   2.54    4.3    5.8    8.9   13.4 |   40      ∞      1

asymptotic; two-sided
    .23    .77   1.85    3.4    4.6    7.6   12.0 |    2      ∞      2
    .56   1.34   2.75    4.5    5.8    9.1   14.8 |   10      ∞      2

finite sample
      0      0     .6    2.0    3.1    5.9    8.6 |    2     40      1
      0      0     .8    2.0    3.5    5.8    8.6 |    2     20      1
      0     .1     .9    3.0    4.4    7.0   11.8 |   10     40      1
      0      0     .7    2.4    3.8    6.5   10.1 |   10     20      1
      0     .1     .9    2.4    3.6    6.8   11.7 |   40     40      1
      0      0     .8    2.2    3.4    6.4   11.6 |   40     20      1

mixed; asymptotic
      0     .4    1.6    3.3    4.5    7.8   14.4 | 3:10    1/3      1
      0     .4    1.6    3.2    4.5    7.9   11.9 | 3:10    3/1      1

mixed; finite sample
      0     .1     .8    2.4    4.2    6.5   13.7 | 3:10  10,30
      0     .1     .8    2.3    3.5    6.7   11.6 | 3:10  30,10

noncentral; asymptotic
     .5    1.8    4.0    6.8    8.7   13.3   19.4 |    2      ∞      1    .30     .2
     .6    2.1    4.4    7.4    9.4   13.9   22.7 |    2      ∞      1    .15     .2
    3.3    6.2   10.0   14.4   17.2   23.7   31.9 |    2      ∞      1    .30     .4
    3.9    7.1   11.1   15.4   18.2   25.3   33.4 |    2      ∞      1    .15     .4
   10.1   14.6   20.2   25.9   29.5   36.7   47.6 |   10      ∞      1    .30     .2
   59.5   70.3   82.0   92.7    100    114    128 |   10      ∞      1    .15     .2
   45.1   54.8   65.4   75.9   82.5     96    111 |   10      ∞      1    .30     .4
    254    277    300    322    336    363    396 |   10      ∞      1    .15     .4

noncentral; finite sample
     .5    1.7    3.9    6.4    8.8   12.9   19.3 |    2     40      1    .30     .2    .32
     .3    1.4    3.7    6.3    8.3   12.1   18.2 |    2     20      1    .30     .2    .45
     .5    1.9    4.2    7.4    9.3   13.8   21.6 |    2     40      1    .15     .2    .18
     .3    1.7    3.8    6.6    8.4   13.5   18.4 |    2     20      1    .15     .2    .26
    3.1    5.8    8.4   13.4   16.4   21.9   29.2 |    2     40      1    .30     .4    .63
    2.6    5.8    8.6   12.6   15.4   20.5   28.3 |    2     20      1    .30     .4    .89
    3.1    6.0    9.9   13.9   17.1   23.8   30.6 |    2     40      1    .15     .4    .36
    3.5    5.9    9.9   14.2   16.4   22.2   31.0 |    2     20      1    .15     .4    .51
Table 1.  Quantiles of $\mathcal{L}(2L)$ cont.

   0.25   0.50   0.75   0.90   0.95   0.99   .999 |    k      n  sides      p   -θ1*      α     p*

noncentral; finite sample (continued)
    5.1    9.1   14.4   19.9   23.6   30.8   43.7 |   10     40      1    .30     .2    .32
    4.8    8.4   13.4   18.7   22.7   30.5   40.4 |   10     20      1    .30     .2    .45
    9.9   16.5   24.5   33.4   39.2   52.1   68.9 |   10     40      1    .15     .2    .18
    7.7   14.4   21.7   30.3   35.7   47.7   60.6 |   10     20      1    .15     .2    .26
   21.5   29.0   37.7   46.0   51.6   62.7   78.1 |   10     40      1    .30     .4    .63
   20.5   27.1   34.6   42.1   46.8   57.8   72.0 |   10     20      1    .30     .4    .89
   33.7   45.3   58.6   72.0   80.8   97.8    114 |   10     40      1    .15     .4    .36
   27.6   38.0   50.2   61.6   69.0   84.1   98.7 |   10     20      1    .15     .4    .51
     89    111    137    161    176    206    245 |   20     40      1    .15     .4    .36
    156    192    232    267    291    331    383 |   40     20      1    .15     .4    .51

restricted; asymptotic
      0    .35    .72    2.1    3.3    6.2   10.5 |    2      ∞      1                           .60
      0    .12    .61    1.9    3.1    5.9   10.1 |    2      ∞      1                           .75
      0    .32    1.4    3.0    4.3    7.6   12.9 |   10      ∞      1                           .60
      0    .12    .97    2.5    3.7    6.9   10.8 |   10      ∞      1                           .75
    .17    .86    2.2    4.0    5.3    8.5   12.3 |   40      ∞      1                           .60
      0    .40    1.5    3.2    4.5    8.0   12.3 |   40      ∞      1                           .75

restricted; finite sample
      0      0     .6    2.0    3.1    6.6   10.8 |    2     40      1                           .60
      0      0     .8    2.0    3.6    6.0   10.1 |    2     20      1                           .75
      0      0     .5    2.0    3.0    5.7   13.0 |    2     40      1                           .60
      0      0     .5    2.0    2.9    5.9    9.4 |    2     20      1                           .75
      0      0     .1    1.5    2.9    6.6   12.0 |   40     40      1                           .60
      0      0     .1    1.4    2.5    6.0   10.3 |   40     20      1                           .75
      0      0      0      0     .3    3.2    7.4 |   40     40      1                           .60
      0      0      0      0     .6    4.0    8.5 |   40     20      1                           .75

chi-square
    .10    .46    1.3    2.7    3.8    6.6   10.8 | with 1 df
    .58   1.39    2.8    4.6    6.0    9.2   13.8 | with 2 df

(.01)[q(1-q)]^{1/2}
   .004   .005   .004   .003   .002   .001  .0004
Table 2


Kullback-Leibler Information $K(\theta_0, \theta)$ and
Standard Deviation SD = $[V(\theta_0)]^{1/2}$ of 2L

            p  |   .300   .300   .300   .300   .150   .150   .150   .150
    k     -θ1  |   .032   .045   .063   .089   .032   .045   .063   .089

    2      KL  |   .017   .033   .066   .131   .019   .037   .072   .141
    2      SD  |   .185   .260   .362   .499   .198   .278   .389   .537
    6      KL  |   .057   .108   .205   .395   .092   .163   .288    .51
    6      SD  |   .364   .502   .682   .892   .511   .691   .921   1.20
   10      KL  |   .105   .193   .354    .66   .204    .34    .57    .94
   10      SD  |   .526   .711   .942   1.18   .880   1.14   1.46   1.82
   20      KL  |   .253    .44    .76   1.34    .58    .90   1.40   2.17
   20      SD  |   .924   1.20   1.51   1.77   1.88   2.32   2.81   3.30
   40      KL  |    .63   1.01   1.64   2.73   1.50   2.21   3.27   4.85
   40      SD  |   1.72   2.12   2.52   2.69   3.94   4.68   5.45   6.09

For comparisons with the noncentral entries of Table 1 we note $-\theta_1^*$ and $\alpha$ for the
following values of n.

            n  |     40     20     40     20     40     20     40     20
         -θ1*  |   .200   .200   .400   .400   .200   .200   .400   .400
            α  |   .316   .447   .632   .894   .181   .256   .361   .511
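
The correspondence between the columns of Table 2 and the local parameters can be reproduced directly. The sketch below assumes the relations $\theta_1 = \alpha\phi/4$ with $\phi = 2p - 1$ (implied by $\mu_1 = \alpha k p + (1 - \alpha)k/2 = k(2\theta_1 + 1/2)$ in Appendix 2) and $\theta_1 = n^{-1/2}\theta_1^*$ (Appendix 3); the function name `local_to_actual` is ours.

```python
import math


def local_to_actual(p, n, minus_theta1_star):
    """Return (-theta_1, alpha) for recombination probability p < 1/2,
    sample size n and local parameter -theta_1^*."""
    phi = 2.0 * p - 1.0                  # phi = 2p - 1 is negative for p < 1/2
    minus_theta1 = minus_theta1_star / math.sqrt(n)
    alpha = 4.0 * minus_theta1 / abs(phi)
    return minus_theta1, alpha


if __name__ == "__main__":
    for p in (0.30, 0.15):
        for minus_theta1_star in (0.2, 0.4):
            for n in (40, 20):
                minus_theta1, alpha = local_to_actual(p, n, minus_theta1_star)
                print(p, n, minus_theta1_star,
                      round(minus_theta1, 3), round(alpha, 3))
```

For p = .30, n = 40 and $-\theta_1^* = .2$ this gives $-\theta_1 \approx .032$ and $\alpha \approx .316$, in agreement with the first column of the table.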